Randomization Inference for Outcomes with Clumping at Zero
In randomized experiments, randomization forms the “reasoned basis for inference.” While randomization inference is well developed for continuous and binary outcomes, there has been comparatively little work on outcomes with nonnegative support and clumping at zero. Typically, outcomes of this type have been modeled using parametric models that impose strong distributional assumptions. This article proposes new randomization inference procedures for nonnegative outcomes with clumping at zero. Instead of making distributional assumptions, we propose various assumptions about the nature of response to treatment. Our methods form a set of nonparametric methods for outcomes that are often described as zero-inflated. These methods are illustrated using two randomized trials where job training interventions were designed to increase the earnings of participants.
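As a concrete illustration of the randomization-inference machinery the abstract builds on, here is a minimal sketch of a Fisher-style randomization test applied to a zero-clumped outcome. It uses a plain difference-in-means statistic on simulated data; the paper's actual procedures, which encode specific assumptions about the nature of response to treatment, are not reproduced, and all names and parameter values below are illustrative.

```python
import numpy as np

def randomization_test(y, z, n_perm=10_000, seed=0):
    """Fisher randomization test of the sharp null of no treatment effect.

    y: observed outcomes (nonnegative, clumped at zero)
    z: binary treatment indicator
    Returns a two-sided p-value for the difference-in-means statistic.
    """
    rng = np.random.default_rng(seed)
    observed = y[z == 1].mean() - y[z == 0].mean()
    null = np.empty(n_perm)
    for b in range(n_perm):
        z_perm = rng.permutation(z)  # re-draw the treatment assignment
        null[b] = y[z_perm == 1].mean() - y[z_perm == 0].mean()
    return np.mean(np.abs(null) >= abs(observed))

# Hypothetical zero-inflated earnings: many exact zeros (non-employment),
# a lognormal right tail for the employed.
rng = np.random.default_rng(1)
n = 200
z = rng.permutation(np.repeat([0, 1], n // 2))
employed = rng.random(n) < 0.4 + 0.15 * z      # treatment raises employment
y = np.where(employed, rng.lognormal(mean=10.0, sigma=0.5, size=n), 0.0)
print(randomization_test(y, z))
```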
Conducting sparse feature selection on arbitrarily long phrases in text corpora with a focus on interpretability
We propose a general framework for topic-specific summarization of large text corpora, and illustrate how it can be used for analysis in two quite different contexts: an OSHA database of fatality and catastrophe reports (to facilitate surveillance for patterns in circumstances leading to injury or death) and legal decisions on workers’ compensation claims (to explore relevant case law). Our summarization framework, built on sparse classification methods, is a compromise between simple word frequency based methods currently in wide use, and more heavyweight, model-intensive methods such as Latent Dirichlet Allocation (LDA). For a particular topic of interest (e.g., mental health disability, or carbon monoxide exposure), we regress a labeling of documents onto the high-dimensional counts of all the other words and phrases in the documents. The resulting small set of phrases found to be predictive is then harvested as the summary. Using a branch-and-bound approach, this method can be extended to allow for phrases of arbitrary length, enabling potentially rich summarization. We discuss how focus on the purpose of the summaries can inform choices of tuning parameters and model constraints. We evaluate this tool by comparing computational time and summary statistics of the resulting word lists to three other methods in the literature. We also present a new R package, textreg. Overall, we argue that sparse methods have much to offer text analysis and represent a branch of research that deserves further consideration in this context.
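The core mechanism, regressing topic labels onto high-dimensional phrase counts and harvesting the nonzero coefficients, can be sketched in a few lines. The sketch below caps phrases at three words via scikit-learn's CountVectorizer rather than using the paper's branch-and-bound extension to arbitrary-length phrases, and the tiny corpus and tuning values are purely illustrative (the authors' own implementation is the R package textreg).

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical labeled corpus: 1 = on-topic (carbon monoxide), 0 = off-topic.
# (In the paper, the phrases defining the topic label itself are excluded
# from the predictors; that step is skipped here for brevity.)
docs = [
    "worker exposed to carbon monoxide in a confined space",
    "carbon monoxide from a gas generator overcame two employees",
    "employee fell from a ladder while painting a ceiling",
    "forklift overturned in the loading dock area",
]
labels = [1, 1, 0, 0]

vec = CountVectorizer(ngram_range=(1, 3))          # phrases up to 3 words
X = vec.fit_transform(docs)
clf = LogisticRegression(penalty="l1", solver="liblinear", C=10.0)
clf.fit(X, labels)

# The sparse set of positively predictive phrases is the summary.
coefs = clf.coef_.ravel()
print([p for p, c in zip(vec.get_feature_names_out(), coefs) if c > 0])
```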
Adjusting treatment effect estimates by post-stratification in randomized experiments
Experimenters often use post-stratification to adjust estimates. Post-stratification is akin to blocking, except that the number of treated units in each stratum is a random variable because stratification occurs after treatment assignment. We analyse both post-stratification and blocking under the Neyman–Rubin model and compare the efficiency of these designs. We derive the variances for a post-stratified estimator and a simple difference-in-means estimator under different randomization schemes. Post-stratification is nearly as efficient as blocking: the difference in their variances is of the order of 1/n², with a constant depending on treatment proportion. Post-stratification is therefore a reasonable alternative to blocking when blocking is not feasible. However, in finite samples, post-stratification can increase variance if the number of strata is large and the strata are poorly chosen. To examine why the estimators’ variances are different, we extend our results by conditioning on the observed number of treated units in each stratum. Conditioning also provides more accurate variance estimates because it takes into account how close (or far) a realized random sample is from a comparable blocked experiment. We then show that the practical substance of our results remains under an infinite population sampling model. Finally, we provide an analysis of an actual experiment to illustrate our analytical results.
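A minimal sketch of the post-stratified estimator itself may help fix ideas: within each stratum, take the treated-minus-control difference in means, then average those differences with weights proportional to stratum size. The paper's variance results and conditional analysis are not reproduced; the data and names below are illustrative, and the estimator assumes every stratum contains at least one treated and one control unit.

```python
import numpy as np

def post_stratified_ate(y, z, strata):
    """Post-stratified estimate: stratum-size-weighted average of
    within-stratum differences in means. Assumes each stratum has
    at least one treated and one control unit."""
    est = 0.0
    for s in np.unique(strata):
        idx = strata == s
        y_s, z_s = y[idx], z[idx]
        diff = y_s[z_s == 1].mean() - y_s[z_s == 0].mean()
        est += idx.mean() * diff            # weight by n_s / n
    return est

# Illustrative data: strata defined by a pretreatment covariate
rng = np.random.default_rng(0)
n = 400
strata = rng.integers(0, 4, size=n)
z = rng.permutation(np.repeat([0, 1], n // 2))
y = strata + 1.0 * z + rng.normal(size=n)   # true treatment effect = 1
print(post_stratified_ate(y, z, strata))
```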
Concise comparative summaries (CCS) of large text corpora with a human experiment
In this paper we propose a general framework for topic-specific summarization of large text corpora and illustrate how it can be used for the analysis of news databases. Our framework, concise comparative summarization (CCS), is built on sparse classification methods. CCS is a lightweight and flexible tool that offers a compromise between simple word frequency based methods currently in wide use and more heavyweight, model-intensive methods such as latent Dirichlet allocation (LDA). We argue that sparse methods have much to offer for text analysis and hope CCS opens the door for a new branch of research in this important field.
For a particular topic of interest (e.g., China or energy), CCS automatically labels documents as being either on- or off-topic (usually via keyword search), and then uses sparse classification methods to predict these labels with the high-dimensional counts of all the other words and phrases in the documents. The resulting small set of phrases found as predictive are then harvested as the summary.
To validate our tool, we designed and conducted a human survey, using news articles from the New York Times international section, to compare the different summarizers with human understanding. We demonstrate our approach with two case studies: a media analysis of the framing of “Egypt” in the New York Times throughout the Arab Spring, and an informal comparison of the New York Times’ and Wall Street Journal’s coverage of “energy.” Overall, we find that the Lasso with L2 normalization can be effectively and usefully used to summarize large corpora, regardless of document size.
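The closing remark about “the Lasso with L2 normalization” can be made concrete with a short sketch. The reading below rescales each phrase's count column to unit L2 norm before fitting the Lasso, so phrases are not selected or dropped merely because of their raw frequency scale; whether the paper applies the normalization exactly this way is an assumption of the sketch, and the data are synthetic.

```python
import numpy as np
from sklearn.linear_model import Lasso

# X: document-by-phrase count matrix; y: +1 on-topic / -1 off-topic labels
rng = np.random.default_rng(0)
X = rng.poisson(0.3, size=(40, 200)).astype(float)
X[:20, :5] += rng.poisson(2.0, size=(20, 5))   # phrases 0-4 mark on-topic docs
y = np.repeat([1.0, -1.0], 20)

norms = np.linalg.norm(X, axis=0)       # L2 norm of each phrase column
norms[norms == 0] = 1.0                 # leave all-zero columns untouched
X_scaled = X / norms

model = Lasso(alpha=0.05).fit(X_scaled, y)
print(np.flatnonzero(model.coef_ > 0))  # indices of phrases harvested as summary
```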
A Conditional Randomization Test to Account for Covariate Imbalance in Randomized Experiments
We consider the conditional randomization test as a way to account for covariate imbalance in randomized experiments. The test accounts for covariate imbalance by comparing the observed test statistic to the null distribution of the test statistic conditional on the observed covariate imbalance. We prove that the conditional randomization test has the correct significance level and introduce original notation to describe covariate balance more formally. Through simulation, we verify that conditional randomization tests behave like more traditional forms of covariate adjustment but have the added benefit of having the correct conditional significance level. Finally, we apply the approach to a randomized product marketing experiment where covariate information was collected after randomization.
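To make the conditioning idea concrete, here is a minimal sketch: the null distribution is built only from re-randomizations whose covariate imbalance (approximately) matches the observed imbalance. The paper develops the notation and exact conditioning more carefully; the tolerance-band approximation for a continuous covariate, and all names below, are assumptions of the sketch.

```python
import numpy as np

def conditional_randomization_test(y, z, x, tol=0.02, n_draws=50_000, seed=0):
    """Randomization test conditional on observed covariate imbalance:
    only assignments whose imbalance is within `tol` of the realized
    imbalance contribute to the null distribution."""
    rng = np.random.default_rng(seed)
    imbalance = lambda z_: x[z_ == 1].mean() - x[z_ == 0].mean()
    obs_imb = imbalance(z)
    obs_stat = y[z == 1].mean() - y[z == 0].mean()
    null = []
    for _ in range(n_draws):
        z_d = rng.permutation(z)
        if abs(imbalance(z_d) - obs_imb) <= tol:  # keep comparable assignments
            null.append(y[z_d == 1].mean() - y[z_d == 0].mean())
    return np.mean(np.abs(np.array(null)) >= abs(obs_stat))
```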
Implementing Risk-Limiting Post-Election Audits in California
Risk-limiting post-election audits limit the chance of certifying an electoral outcome if the outcome is not what a full hand count would show. Building on previous work, we report on pilot risk-limiting audits in four elections during 2008 in three California counties: one during the February 2008 Primary Election in Marin County and three during the November 2008 General Elections in Marin, Santa Cruz and Yolo Counties. We explain what makes an audit risk-limiting and how existing and proposed laws fall short. We discuss the differences among our four pilot audits. We identify challenges to practical, efficient risk-limiting audits and conclude that current approaches are too complex to be used routinely on a large scale. One important logistical bottleneck is the difficulty of exporting data from commercial election management systems in a format amenable to audit calculations. Finally, we propose a bare-bones risk-limiting audit that is less efficient than these pilot audits, but avoids many practical problems.
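For readers unfamiliar with what “risk-limiting” means operationally, the sketch below shows a generic sequential ballot-polling rule in the style of later BRAVO-type audits: certification requires strong sequential evidence against a tied race, which caps the chance of certifying a wrong outcome at the risk limit. This is not the specific machinery used in the pilots described here; it is only an illustration of the risk-limiting property.

```python
def ballot_polling_audit(sampled_ballots, winner_share, risk_limit=0.05):
    """Sequential ballot-polling test (BRAVO-style sketch).

    sampled_ballots: iterable of booleans, True if the sampled ballot
        shows a vote for the reported winner.
    winner_share: reported vote share of the winner (must exceed 0.5).
    Certifies only when the likelihood ratio of the reported result
    against a tied race exceeds 1/risk_limit, so the chance of
    certifying a wrong outcome is at most risk_limit.
    """
    lr = 1.0
    for for_winner in sampled_ballots:
        lr *= (winner_share / 0.5) if for_winner else ((1 - winner_share) / 0.5)
        if lr >= 1.0 / risk_limit:
            return "certify"
    return "escalate"  # keep sampling, or fall back to a full hand count
```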
Three Statistical Methods for the Social Sciences
Social sciences offer particular challenges to statistics due to difficulties such as conducting randomized experiments in this domain, the large variation in humans, the difficulty in collecting complete datasets, and the typically unstructured nature of data at the human scale. New technology allows for increased computation and data recording, which has in turn brought forth new innovations for analysis. Because of these challenges and innovations, statistics in the social sciences is currently thriving and vibrant.

This dissertation is an argument for evaluating statistical methodology in the social sciences along four major axes: validity, interpretability, transparency, and employability. We illustrate how one might develop methods that achieve these four goals with three case studies.

The first is an analysis of post-stratification, a form of covariate adjustment used to evaluate treatment effects. In contrast to recent results showing that regression adjustment can be problematic under the Neyman–Rubin model, we show that post-stratification, something that can easily be done in, e.g., natural experiments, has precision similar to that of a randomized block trial as long as there are not too many strata; the difference in their variances is of the order of 1/n². Post-stratification thus potentially allows for transparently exploiting predictive covariates and random mechanisms in observational data. This case study illustrates the value of analyzing a simple estimator under weak assumptions, and of finding similarities between different methodological approaches so as to leverage earlier findings in a new domain.

We then present a framework for building statistical tools to extract topic-specific key-phrase summaries of large text corpora (e.g., the New York Times) and a human validation experiment to determine best practices for this approach. These tools, built from high-dimensional, sparse classifiers such as L1-logistic regression and the Lasso, can be used to, for example, translate essential concepts across languages, investigate massive databases of aviation reports, or understand how different topics of interest are covered by various media outlets. This case study demonstrates how more modern methods can be evaluated using external validation in order to demonstrate that they produce meaningful and comprehensible results that can be broadly used.

The third chapter presents the trinomial bound, a new auditing technique for elections rooted in very minimal assumptions. We demonstrated the usability of this technique by auditing contests in Santa Cruz and Marin counties, California, in November 2008. The audits were risk-limiting, meaning they had a pre-specified minimum chance of requiring a full hand count if the outcomes were wrong. The trinomial bound gave better results than the Stringer bound, a tool common in accounting for analyzing financial audit samples drawn with probability proportional to an error bound. This case study focuses on generating methods that are employable and transparent so as to serve a public need.

Throughout, we argue that, especially in the difficult domain of the social sciences, we must pay extra attention to the first axis, validity. This motivates our use of the Neyman–Rubin model for the analysis of post-stratification, our development of an approach for external, model-independent validation of the key-phrase extraction tools, and our minimal assumptions for election auditing.
To Adjust or Not to Adjust? Sensitivity Analysis of M-Bias and Butterfly-Bias
“M-Bias”, as it is called in the epidemiological literature, is the bias introduced by conditioning on a pretreatment covariate due to a particular “M-Structure” between two latent factors, an observed treatment, an outcome, and a “collider”. This potential source of bias, which can occur even when the treatment and the outcome are not confounded, has been a source of considerable controversy. We here present formulae for identifying under which circumstances biases are inflated or reduced. In particular, we show that the magnitude of M-Bias in Gaussian linear structural equation models tends to be relatively small compared to confounding bias, suggesting that it is generally not a serious concern in many applied settings. These theoretical results are consistent with recent empirical findings from simulation studies. We also generalize the M-Bias setting to allow for the correlation between the latent factors to be nonzero, and to allow for the collider to also be a confounder between the treatment and the outcome. These results demonstrate that mild deviations from the M-Structure tend to increase confounding bias more rapidly than M-Bias, suggesting that conditioning on any given covariate is generally the superior choice. As an application, we re-examine a controversial example debated between Professors Donald Rubin and Judea Pearl.
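The M-Structure is easy to reproduce in simulation, which makes the bias tangible: with independent latent factors, the treatment-outcome association is zero until one conditions on the collider. The Gaussian linear SEM below uses arbitrary illustrative coefficients, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000
u1 = rng.normal(size=n)                        # latent factor -> treatment, collider
u2 = rng.normal(size=n)                        # latent factor -> outcome, collider
m = 0.8 * u1 + 0.8 * u2 + rng.normal(size=n)   # pretreatment collider
t = 0.8 * u1 + rng.normal(size=n)              # treatment
y = 0.8 * u2 + rng.normal(size=n)              # outcome; true effect of t is 0

slope = lambda a, b: np.cov(a, b)[0, 1] / np.var(a)

print(slope(t, y))           # ~0: unadjusted estimate is unbiased (no confounding)
# Adjust for the collider via partial regression (residualize both on m):
t_res = t - slope(m, t) * m
y_res = y - slope(m, y) * m
print(slope(t_res, y_res))   # nonzero: M-Bias induced by conditioning on m
```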
Decomposing Treatment Effect Variation
Understanding and characterizing treatment effect variation in randomized experiments has become essential for going beyond the "black box" of the average treatment effect. Nonetheless, traditional statistical approaches often ignore or assume away such variation. In the context of a randomized experiment, this paper proposes a framework for decomposing overall treatment effect variation into a systematic component that is explained by observed covariates, and a remaining idiosyncratic component. Our framework is fully randomization-based, with estimates of treatment effect variation that are fully justified by the randomization itself. Our framework can also account for noncompliance, which is an important practical complication. We make several key contributions. First, we show that randomization-based estimates of systematic variation are very similar in form to estimates from fully-interacted linear regression and two stage least squares. Second, we use these estimators to develop an omnibus test for systematic treatment effect variation, both with and without noncompliance. Third, we propose an R²-like measure of treatment effect variation explained by covariates and, when applicable, noncompliance. Finally, we assess these methods via simulation studies and apply them to the Head Start Impact Study, a large-scale randomized experiment.
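As a small illustration of the regression connection the abstract mentions, the sketch below fits a fully-interacted linear regression on simulated data where the true unit-level effects are known, and computes the oracle R²-like ratio of systematic to total effect variation. This is a toy under a known data-generating process, not the paper's randomization-based estimators, and it ignores noncompliance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)
tau = 1.0 + 0.5 * x + rng.normal(scale=0.5, size=n)  # unit-level effects
y0 = rng.normal(size=n)
y1 = y0 + tau
z = rng.permutation(np.repeat([0, 1], n // 2))
y = np.where(z == 1, y1, y0)

# Fully-interacted regression: y ~ 1 + z + x + z:x. The coefficients on z
# and z:x estimate the systematic (covariate-explained) effect model.
design = np.column_stack([np.ones(n), z, x, z * x])
beta = np.linalg.lstsq(design, y, rcond=None)[0]
print(f"estimated effect model: {beta[1]:.2f} + {beta[3]:.2f} * x")

# Oracle R^2-like measure (computable here because tau is simulated):
# share of total effect variation explained by the covariate.
print(np.var(0.5 * x) / np.var(tau))   # ~0.5 by construction
```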
Randomization inference for treatment effect variation
Applied researchers are increasingly interested in whether and how treatment effects vary in randomized evaluations, especially variation that is not explained by observed covariates. We propose a model-free approach for testing for the presence of such unexplained variation. To use this randomization-based approach, we must address the fact that the average treatment effect, which is generally the object of interest in randomized experiments, actually acts as a nuisance parameter in this setting. We explore potential solutions and advocate for a method that guarantees valid tests in finite samples despite this nuisance. We also show how this method readily extends to testing for heterogeneity beyond a given model, which can be useful for assessing the sufficiency of a given scientific theory. We finally apply our method to the Head Start Impact Study, a large-scale randomized evaluation of a Federal preschool programme, finding that there is indeed significant unexplained treatment effect variation.
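To convey the nuisance-parameter difficulty the abstract describes, here is a naive plug-in version of a heterogeneity test: shift the control outcomes by the estimated average effect and compare the two distributions. The paper's own method replaces this plug-in with a procedure that remains valid in finite samples despite the unknown average effect; the sketch below carries no such guarantee and is purely illustrative.

```python
import numpy as np
from scipy.stats import ks_2samp

def plug_in_heterogeneity_test(y, z, n_perm=2_000, seed=0):
    """Naive plug-in test of the constant-effect null: align the groups
    by the estimated ATE, then compare their distributions with a KS
    statistic against a permutation null. Treating the estimated ATE as
    known is exactly the nuisance problem the paper resolves."""
    rng = np.random.default_rng(seed)

    def stat(z_):
        shift = y[z_ == 1].mean() - y[z_ == 0].mean()
        return ks_2samp(y[z_ == 1], y[z_ == 0] + shift).statistic

    obs = stat(z)
    null = np.array([stat(rng.permutation(z)) for _ in range(n_perm)])
    return np.mean(null >= obs)
```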